- 
                Notifications
    
You must be signed in to change notification settings  - Fork 292
 
Allow the opensearch operator to watch multiple namespaces #1101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow the opensearch operator to watch multiple namespaces #1101
Conversation
710693f    to
    4cca417      
    Compare
  
    | 
           @prudhvigodithi @swoehrl-mw Greetings! I'm an SRE with the Wikimedia Foundation and I work with @brouberol . We're rolling out a new OpenSearch environment on K8s in the next month or so and I was wondering if y'all had the cycles to review this change? There's more context in our task tracker if y'all are interested. Thanks for taking a look and feel free to ping here or in OpenSearch Slack if you have any questions or comments.  | 
    
| 
           Adding @patelsmit32123 @synhershko to please take a look and add your thoughts.  | 
    
          
 FWIW we are planning a massive release of a 3.0 version of this operator which will be significantly better and safer to use in production.  | 
    
| 
           What is the use-case for doing what you are doing here? I might be missing something  | 
    
          
 The use case is mostly deployment convenience (only having to deploy a single operator cluster-wide), as well as aligning with common operator behavior within the Kubernetes ecosystem. For example, having a single operator being able to watch multiple operators is supported by: 
 Note: We're not actively running all of these operators (only a subset), but I sampled actively maintained operator codebases and documentation to showcase that this is a common behavior. My point (and IMHO the general sentiment over at #374) is that this behavior is expected by operator users and deployers, as it's become quite standard. It is something that is natively supported by the operator SDK, and does not take anything away from the current operator behavior, as it only adds the ability to have the operator manage one to many clusters, instead of a single one at the moment. I hope it clears things up. Note:: I just realized that for this patch to be complete, it's lacking iterating over watched namespaces to setup the appropriate roles and role bindings in each of them. I'm happy to send that work over if the feature request intention is approved.  | 
    
| 
           Got it. Happy to merge this once conflicts are resolved and CI is green.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- First, please resolve conflicts.
 - Second,
opensearch-k8s-operator/charts/opensearch-operator/values.yaml
Lines 119 to 123 in ccd97c6
## If this is set to true, RoleBindings will be used instead of ClusterRoleBindings, inorder to restrict ClusterRoles ## to the namespace where the operator and OpenSearch cluster are in. In that case, specify the namespace where they ## are in in manager.watchNamespace field. ## If false, ClusterRoleBindings will be used useRoleBindings: false 
This is not directly linked to your changes but good to have in the same scope.
useRoleBindingsshould be enabled only whenwatchNamespaceis equal to the release namespace. If not, then we need to useclusterRoleBindings.
Please update the helm doc ofuseRoleBindingspart properly and fix typo in the current one. 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a critical issue when watchNamespace is not set (empty string).
We keep the original `-watch-namespace` flag, to ensure backwards compatibility. We simply split the value over any comma, and populate the cache for each namespace in the csv. Note: Because the `watchNamespace` variable was being tested for emptiness _before_ `flag.Parse()` was being called, it was always empty, causing the operator to _always_ watch all namespaces in the cluster. This is no longer the case. Fixes opensearch-project#374 Signed-off-by: Balthazar Rouberol <[email protected]>
| 
           @josedev-union I'm confused by the  The chart defines 2 ClusterRoles: 
 If we define  Cf https://kubernetes.io/docs/reference/access-authn-authz/rbac/#rolebinding-and-clusterrolebinding 
 (emphasis mine) The way I see if, the logic should be to use RoleBindings as soon as  WDYT?  | 
    
4cca417    to
    e11f38f      
    Compare
  
    
          
 yes, we need to use clusterRoleBinding in such cases.  | 
    
| 
           So, this is where I'm not sure I agree (or maybe we agree and I'm just misunderstanding). If you watch specific namespaces, by having a non-empty  The RoleBinding / ClusterRoleBinding decision should be: flowchart TD
    B{Is it manager.watchNamespace empty?}
    B -- Yes --> C[Use a ClusterRoleBinding to give permissions to the operator on *all* namespaces]
    B -- No ----> D[Use a RoleBinding on the operator ClusterRoleBinding in each of the watched namespaces]
    Happy to hear your thoughts  | 
    
| 
           @brouberol nah, it is much more than my original thought. :)  | 
    
| 
           Ok, I think I pinpointed where our lack of understanding is coming from. Let me know if I get this right. The opensearch operator is usually deployed in the same namespace than the opensearch cluster. When that is the case, we can use a  In that light, I understand your comment: having  What I was pushing towards though, is considering that the operator can be run from its own namespace (as it is common in the ecosystem to run operator wither from  classDiagram
    ClusterRole <|-- RoleBindingNS1
    ClusterRole <|-- RoleBindingNS2
    
    class ClusterRole {
        name: opensearch-operator-manager-role
        permissions: [...]
    }
    class RoleBindingNS1 {
        namespace: ns1
        ---
        roleRef.apiGroup: rbac.authorization.k8s.io
        roleRef.kind: ClusterRole
        roleRef.name: opensearch-operator-manager-role
        subjects[0].kind: ServiceAccount
        subjects[0].name: opensearch-operator
        subjects[0].namespace: opensearch-operator
    }
    class RoleBindingNS2 {
        namespace: ns2
        ---
        roleRef.apiGroup: rbac.authorization.k8s.io
        roleRef.kind: ClusterRole
        roleRef.name: opensearch-operator-manager-role
        subjects[0].kind: ServiceAccount
        subjects[0].name: opensearch-operator
        subjects[0].namespace: opensearch-operator    }
    The main reason to do this would be to avoid granting the opensearch-operator to read/write/delete/update  I'm happy to get told "let's punt this to another PR" and I'll just update the comment. However, let's be aware than it its current state, the operator permissions are wide open. Linking back to #374 original message, we see 
 This is the security issue they're mentioning.  | 
    
          
 I prefer to open a new issue for this.  | 
    
Description
We allow the opensearch-operator to watch multiple namespaces.
We keep the original
-watch-namespaceflag, to ensure backwards compatibility. We simply split the value over any comma, and populate the cache for each namespace in the csv.Note: Because the
watchNamespacevariable was being tested for emptiness beforeflag.Parse()was being called, it was always empty, causing the operator to always watch all namespaces in the cluster. This is no longer the case.I have added documentation in the user guide as well as in the chart values.
Because this change occurs in
main.go, for which we don't have unit tests, I'll enclose my manual test notes.Testing
We first rebuild the operator binary.
We ensure that the new behavior is now available.
We run the operator alongside a local minikube.
We define a namespace-less cluster resource
~/c/opensearch-k8s-operator/opensearch-operator watch-multiple-ns ?1 ❯ cat cluster.yamlWe create 3 namespaces
We now create an opensearch cluster in
ns1We start seeing activity in the operator logs
We now create an opensearch cluster in
ns2:We start seeing activity in the operator logs, this time related to the cluster in
ns2We finally create a cluster in
ns3:This time, no log related to the cluster in
ns3is observed in the controller logs.Chart changes
I render the chart using the default values. The output does not contain the
-watch-namespaceflag.I then inject either a single or multiple namespaces to watch, either in a csv or in a list, to ensure that the rendering is correct:
Issues Resolved
Closes #374
Check List
make lint)If CRDs are changed:
make manifests) and also copied into the helm chartPlease refer to the PR guidelines before submitting this pull request.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.